Margin of error
The margin of error is a statistic expressing the amount of random sampling error in the results of a survey. The larger the margin of error, the less confidence one should have that a poll result would reflect the result of a census of the entire population. The margin of error will be positive whenever a population is incompletely sampled and the outcome measure has positive variance, which is to say, the measure ''varies''. The term ''margin of error'' is often used in non-survey contexts to indicate observational error in reporting measured quantities.


Concept

Consider a simple ''yes/no'' poll P as a sample of n respondents drawn from a population N\ (n \ll N), reporting the percentage p of ''yes'' responses. We would like to know how close p is to the true result of a survey of the entire population N, without having to conduct one. If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), we would expect those subsequent results p_1, p_2, \ldots to be normally distributed about their mean \overline{p}. The ''margin of error'' describes the distance within which a specified percentage of these results is expected to vary from \overline{p}. According to the 68-95-99.7 rule, we would expect that 95% of the results p_1, p_2, \ldots will fall within ''about'' two standard deviations (\pm 2\sigma_P) either side of the true mean \overline{p}. This interval is called the confidence interval, and the ''radius'' (half the interval) is called the ''margin of error'', corresponding to a 95% ''confidence level''.

Generally, at a confidence level \gamma, a sample sized n of a population having expected standard deviation \sigma has a margin of error

:MOE_\gamma = z_\gamma \times \sqrt{\frac{\sigma^2}{n}}

where z_\gamma denotes the ''quantile'' (also, commonly, a ''z-score''), and \sqrt{\frac{\sigma^2}{n}} is the ''standard error''.
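As a minimal sketch of the formula above (the helper name `margin_of_error` is ours, not a standard API), the quantile z_\gamma can be taken from the standard library's `statistics.NormalDist`:

```python
from statistics import NormalDist

def margin_of_error(sigma: float, n: int, confidence: float = 0.95) -> float:
    """MOE_gamma = z_gamma * sqrt(sigma^2 / n)."""
    # Two-sided quantile: P(-z < Z < z) = confidence
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * (sigma ** 2 / n) ** 0.5

# With sigma = 0.5 and n = 1013:
print(round(margin_of_error(0.5, 1013), 3))  # 0.031
```

The same function reproduces the worked percentages later in the article, up to rounding of the tabulated z values.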


Standard deviation and standard error

We would expect the normally distributed values p_1, p_2, \ldots to have a standard deviation which somehow varies with n: the smaller n, the wider the margin of error. This is called the standard error \sigma_{\overline{p}}. For the single result from our survey, we ''assume'' that p = \overline{p}, and that ''all'' subsequent results p_1, p_2, \ldots together would have a variance \sigma_P^2 = P(1-P).

:\text{Standard error} = \sigma_{\overline{p}} \approx \sqrt{\frac{\sigma_P^2}{n}} \approx \sqrt{\frac{p(1-p)}{n}}

Note that p(1-p) corresponds to the variance of a Bernoulli distribution.
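The estimated standard error of a sample proportion is a one-liner; as a sketch (the function name is ours):

```python
def standard_error(p: float, n: int) -> float:
    """sqrt(p(1-p)/n): estimated standard error of a sample proportion."""
    # p(1-p) is the variance of a Bernoulli(p) variable
    return (p * (1 - p) / n) ** 0.5

print(round(standard_error(0.5, 1013), 4))  # 0.0157
```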


Maximum margin of error at different confidence levels

For a confidence ''level'' \gamma, there is a corresponding confidence ''interval'' about the mean, \mu \pm z_\gamma\sigma, that is, the interval [\mu - z_\gamma\sigma,\ \mu + z_\gamma\sigma] within which values of P should fall with probability \gamma. Precise values of z_\gamma are given by the quantile function of the normal distribution (which the 68-95-99.7 rule approximates). Note that z_\gamma is undefined for |\gamma| \ge 1; that is, z_{1.00} is undefined, as is z_{1.10}.

Since \max \sigma_P^2 = \max P(1-P) = 0.25 at p = 0.5, we can arbitrarily set p = \overline{p} = 0.5, calculate \sigma_P, \sigma_{\overline{p}}, and z_\gamma\sigma_{\overline{p}} to obtain the ''maximum'' margin of error for P at a given confidence level \gamma and sample size n, even before having actual results. With p = 0.5 and n = 1013:

:MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx z_{0.95}\sqrt{\frac{0.5^2}{n}} = 1.96\sqrt{\frac{0.25}{n}} = 0.98/\sqrt{n} = \pm 3.1\%
:MOE_{99}(0.5) = z_{0.99}\sigma_{\overline{p}} \approx z_{0.99}\sqrt{\frac{0.5^2}{n}} = 2.58\sqrt{\frac{0.25}{n}} = 1.29/\sqrt{n} = \pm 4.1\%

Also, usefully, for any reported MOE_{95},

:MOE_{99} = \frac{z_{0.99}}{z_{0.95}} MOE_{95} \approx 1.3 \times MOE_{95}
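The worst-case calculation can be sketched directly (helper name is ours; exact quantiles are used rather than the rounded constants 1.96 and 2.58):

```python
from statistics import NormalDist

def max_margin_of_error(n: int, confidence: float) -> float:
    """Worst-case MOE, taking p = 0.5 where p(1 - p) peaks at 0.25."""
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * (0.25 / n) ** 0.5

print(round(max_margin_of_error(1013, 0.95), 4))  # 0.0308, about ±3.1%
print(round(max_margin_of_error(1013, 0.99), 4))  # 0.0405, about ±4.1%
```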


Specific margins of error

If a poll has multiple percentage results (for example, a poll measuring a single multiple-choice preference), the result closest to 50% will have the highest margin of error. Typically, it is this number that is reported as the margin of error for the entire poll. Imagine poll P reports p_a, p_b, p_c as 71%, 27%, 2%, with n = 1013:

:MOE_{95}(P_a) = z_{0.95}\sigma_{\overline{p}} \approx 1.96\sqrt{\frac{p_a(1-p_a)}{n}} = 0.89/\sqrt{n} = \pm 2.8\%
:MOE_{95}(P_b) = z_{0.95}\sigma_{\overline{p}} \approx 1.96\sqrt{\frac{p_b(1-p_b)}{n}} = 0.87/\sqrt{n} = \pm 2.7\%
:MOE_{95}(P_c) = z_{0.95}\sigma_{\overline{p}} \approx 1.96\sqrt{\frac{p_c(1-p_c)}{n}} = 0.27/\sqrt{n} = \pm 0.8\%

As a given percentage approaches the extremes of 0% or 100%, its margin of error approaches ±0%.
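A short sketch computing the per-result margins for such a poll (the `moe` helper is ours):

```python
from statistics import NormalDist

Z_95 = NormalDist().inv_cdf(0.975)  # about 1.96

def moe(p: float, n: int) -> float:
    """95% margin of error for a single reported proportion p."""
    return Z_95 * (p * (1 - p) / n) ** 0.5

for p in (0.71, 0.27, 0.02):
    print(f"p = {p:.0%}: MOE = ±{moe(p, 1013):.1%}")
```

Note that with the exact quantile the last value prints as ±0.9% rather than the ±0.8% obtained from the rounded constant 0.27; both round-offs of 1.96 × sqrt(0.02 × 0.98 / 1013) ≈ 0.0086 are common in practice.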


Comparing percentages

Imagine multiple-choice poll P reports p_a, p_b, p_c as 46%, 42%, 12%, with n = 1013. As described above, the margin of error reported for the poll would typically be MOE_{95}(P_a), as p_a is closest to 50%. The popular notion of a ''statistical tie'' or ''statistical dead heat'', however, concerns itself not with the accuracy of the individual results, but with that of the ''ranking'' of the results. Which is in first? If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), and report the result p_w = p_a - p_b, we could use the ''standard error of difference'' to understand how p_{w_1}, p_{w_2}, p_{w_3}, \ldots would be expected to fall about \overline{p_w}. For this, we need to apply the ''sum of variances'' to obtain a new variance, \sigma_{P_w}^2,

:\sigma_{P_w}^2 = \sigma_{P_a - P_b}^2 = \sigma_{P_a}^2 + \sigma_{P_b}^2 - 2\sigma_{P_a,P_b} = p_a(1-p_a) + p_b(1-p_b) + 2p_a p_b

where \sigma_{P_a,P_b} = -P_a P_b is the covariance of P_a and P_b. Thus (after simplifying),

:\text{standard error of difference} = \sigma_{\overline{w}} \approx \sqrt{\frac{\sigma_{P_w}^2}{n}} = \sqrt{\frac{0.46 \times 0.54 + 0.42 \times 0.58 + 2 \times 0.46 \times 0.42}{1013}} = 0.029, \quad P_w = P_a - P_b = 4\%
:MOE_{95}(P_w) = z_{0.95}\sigma_{\overline{w}} \approx \pm 5.8\%
:MOE_{90}(P_w) = z_{0.90}\sigma_{\overline{w}} \approx \pm 4.8\%

Note that this assumes that P_c is close to constant, that is, respondents choosing either A or B would almost never choose C (making P_a and P_b close to ''perfectly negatively correlated''). With three or more choices in closer contention, choosing a correct formula for \sigma_{P_w}^2 becomes more complicated.
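The standard-error-of-difference calculation can be sketched as follows (the helper name is ours, and it bakes in the cov(P_a, P_b) = -p_a p_b assumption discussed above):

```python
from statistics import NormalDist

def moe_of_lead(p_a: float, p_b: float, n: int, confidence: float = 0.95) -> float:
    """MOE of the lead p_a - p_b, assuming cov(P_a, P_b) = -p_a * p_b."""
    # Sum of variances minus twice the (negative) covariance
    var_w = p_a * (1 - p_a) + p_b * (1 - p_b) + 2 * p_a * p_b
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return z * (var_w / n) ** 0.5

print(f"±{moe_of_lead(0.46, 0.42, 1013):.1%}")        # ±5.8%
print(f"±{moe_of_lead(0.46, 0.42, 1013, 0.90):.1%}")  # ±4.8%
```

A 4% lead with a ±5.8% margin on the difference is why such a race is often called a statistical tie even though each individual result has a margin of only about ±3.1%.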


Effect of finite population size

The formulae above for the margin of error assume that there is an infinitely large population, and thus do not depend on the size of the population N but only on the sample size n. According to sampling theory, this assumption is reasonable when the sampling fraction is small. The margin of error for a particular sampling method is essentially the same regardless of whether the population of interest is the size of a school, city, state, or country, as long as the sampling ''fraction'' is small. In cases where the sampling fraction is larger (in practice, greater than 5%), analysts might adjust the margin of error using a finite population correction (FPC) to account for the added precision gained by sampling a much larger percentage of the population. The FPC can be calculated using the formula

:\operatorname{FPC} = \sqrt{\frac{N-n}{N-1}}

...and so, if poll P were conducted over 24% of, say, an electorate of 300,000 voters (n = 72{,}000),

:MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx \frac{0.98}{\sqrt{72{,}000}} = \pm 0.4\%
:MOE_{95,FPC}(0.5) = z_{0.95}\sigma_{\overline{p}}\sqrt{\frac{N-n}{N-1}} \approx \frac{0.98}{\sqrt{72{,}000}}\sqrt{\frac{300{,}000-72{,}000}{300{,}000-1}} = \pm 0.3\%

Intuitively, for appropriately large N,

:\lim_{n \to 0} \sqrt{\frac{N-n}{N-1}} \approx 1
:\lim_{n \to N} \sqrt{\frac{N-n}{N-1}} = 0

In the former case, n is so small as to require no correction. In the latter case, the poll effectively becomes a census and sampling error becomes moot.
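The correction is a simple multiplier on the uncorrected margin; as a sketch (helper names are ours):

```python
def fpc(N: int, n: int) -> float:
    """Finite population correction: sqrt((N - n) / (N - 1))."""
    return ((N - n) / (N - 1)) ** 0.5

def max_moe_fpc(N: int, n: int, z: float = 1.96) -> float:
    """Worst-case 95% MOE (p = 0.5) with the finite population correction."""
    return z * (0.25 / n) ** 0.5 * fpc(N, n)

N, n = 300_000, 72_000  # a 24% sampling fraction
print(round(max_moe_fpc(N, n), 4))  # 0.0032, i.e. about ±0.3%
```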


See also

* Engineering tolerance
* Key relevance
* Measurement uncertainty
* Random error



External links

* {{mathworld | urlname = MarginofError | title = Margin of Error}}